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Abstract 



We present the detailed process of converting the classical Fourier Transform algorithm into 
the quantum one by using QR decomposition. This provides an example of a technique for 
building quantum algorithms using classical ones. The Quantum Fourier Transform is one of 
the most important quantum subroutines known at present, used in most algorithms that have 
exponential speed up compared to the classical ones. We briefly review Fast Fourier Transform 
and then make explicit all the steps that led to the quantum formulation of the algorithm, 
generalizing Coppersmith's work. 
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1. Introduction 

X 

For many years the spectral analysis of sampled data over a finite range, referred as the 
Discrete Fourier Transform (DFT) was performed directly on computers, using 0(N 2 ) 
operations, where N is the number of data points. A great milestone in Fourier analysis 
was the paper published in 1965 by Cooley and Tukey [1], where they described the 
so-called Fast Fourier Transform (FFT) algorithm, which can compute the DFT with 
only 0(N log N) operations. In the next year, Rudnick presented a computer program 
which also required 0(N log N) operations [2], being inspired by an earlier method due 
to Danielson and Lanczos [3]. 
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Although the FFT algorithm may seem reasonably fast for classical Computer Science, 
it turns out that its quantum version can provide exponentially faster algorithms for 
some problems. In 1994, Shor [4] developed a quantum version of the Fourier Transform 
when the prime factors of N are not large (smaller than logiV). Motivated by Shor's 
work, Coppersmith [5] developed the quantum version of the FFT when N is a power 
of two, which required only (9(log N) operations. In the same year Cleve [6], using a 
recursive approach, has also shown how to implement the Fourier Transform in quantum 
computers. 1 This is, with no doubt, the most important quantum sub-routine designed 
so far. Most quantum algorithms with exponential improvement when compared to the 
classical counterparts use the Quantum Fourier Transform (QFT) as an essential part. 
These algorithms solve instances of the Hidden Subgroup Problem (HSP), which has 
several important applications. A good survey on the HSP can be found in [8]. 

The quantum algorithm for an approximate QFT was developed by Coppersmith [5], 
with complexity 0(m\ogN), where 1 < m < logN is the parameter that defines the de- 
gree of approximation, usually taken as 0(loglog N). Indeed, this approximate algorithm 
is the one with practical applications. The exact transform yields a level of accuracy that, 
in most of the cases, cannot even be detected by the measuring device. Barenco et at [9] 
showed that, in the presence of decoherence, i.e., in realistic situations, the approximate 
transform may achieve better results than the exact one. They also showed that the ac- 
curacy of the exact QFT can be achieved by applying the approximate QFT repeatedly 
O f 1 -^) times. 



In this paper we build the QFT algorithm from the classical FFT, generalizing Copper- 
smith's work. In order to accomplish this generalization, we use the QR decomposition. 
This technique may provide insight for the development of new quantum algorithms. 

In Sec. 2 we review the definition of DFT and the classical FFT algorithm. We also fix 
the notation that will be used in the generalization of the quantum algorithm. In Sec. 3 
we perform the decomposition of the matrix form of FFT algorithm. QR decomposition 
plays an essential role in this task. The resulting unitary matrices can be interpreted as 
quantum logic gates. In Sec. 4 we summarize the results obtained in the previous sections 
and show how to build the circuit for the exact QFT. 

2. The Discrete Fourier Transform 

In this section we address the DFT in a form that will be useful later on. It is important 
to establish some notations that will be used throughout this paper. Let n be a positive 
integer, and let a,c be n-bit integers. The binary representations of a and c will be 
denoted, respectively, by 

(a n -ia n -2 ■ ■ ■ 00)2 and (c„_ic„_ 2 • • • 00)2, 
such that, 

n — 1 7i—l 

a = y^ a j 2 J and c = N^ Cj 2 J . 
3=0 j=o 



1 Nielsen and Chuang [7] also mention an unpublished work on this subject by David Deutsch. 
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Binary representations of numbers may have the subscript (-)2 omitted when context is 
clear. All logarithms are base two. Let X and Y be arrays 2 of N = 2" complex numbers. 
The integers a and c defined above will be useful for indexing X and Y. The notation 
used in this paper for that indexing is X a . Let u = ujn = exp(27ri/iV), be the TV-root of 
unity. 

Finally, we may define the DFT. 
Definition 1 (DFT) The Discrete Fourier Transform takes as input a complex array 
X , and converts it into a complex array Y , according to the expression 

N-l 

Y * - ^ E *■«". (i) 

Viv a=0 

This equation may be rewritten by making explicit the binary representation of a and 
c. We concentrate only on the sub-term uj ac . 

= exp(^ ]T a jCk 2^\. (2) 



N 

0<j,k<n-l 



Using a/ 2 ) = 1, we can state that 



j(23+k) = (2 » )a *+*-» 



1, (3) 



if j + k > n. Therefore, it is possible to eliminate terms from Eq. (2) when j + k > n, 
leading us to an equivalent definition of the DFT, 

Y c = -L= £ X a cJ 2 -^ £ -J^A. (4) 

V 0<a<Af-l \ 0<j,fe<n-l / 

j-\-k<n— 1 

Coppersmith [5] observed that we may define a more general transform by changing 
the range of j' + fc in Eq. (4). If we take n — m < j + k < n— 1, where to is a parameter such 
that 1 < m < n, we obtain the definition of the Approximate Fourier Transform. When 
to = n, we recover the definition of the exact DFT. When m = 1, we achieve the a slighty 
modified definition of the well known Hadamard transform — with a different indexing of 
the elements in the array. The argument in the exponent of this new transform differs 
from that in the exponent of DFT just by 

- = 7T E a ^ +k - (5) 

0<j,k<n-l 
j-\-k<n—m 

It is not difficult to show that the magnitude |e| of that error in Eq. (5) is bounded 
by 2irn2~ m . Both the approximate and the exact Fourier Transforms may be expressed 
by a matrix operation over the array X. The entries of the corresponding matrices only 
differ by a multiplicative factor exp (ie), where |e| < 2irn2~ m . Therefore, the approximate 
transform is able to provide results very close to the exact Fourier Transform, even if 
the parameter m is not so close to n, because the error decreases exponentially as m 



2 Some readers may prefer to see X and Y as functions, respectively X : Zjv — ► C, and Y : Zjv 
where N = 2". 



increases. The minimum m required to ensure an error not greater than |e maa; | is given 

by 

m = log r + log log A. (6) 

l^max | 

Thus, the Approximate Fourier Transform provides results with a fixed tolerance |e m aa;| 
by using a parameter m that barely increases with the size of the input. 

We now briefly review the classical FFT [10, ff] via Daniclson Lanczos lemma. Daniel- 
son Lanczos lemma [3] allows the development of a recursive divide- and-conquer scheme 
to calculate the DFT. According to this lemma, Eq. (f) can be split into two parts. 
These parts of length A/2 can be calculated by employing again Danielson Lanczos 
lemma, leading to the calculation of four new Fourier Transforms, of length A/4 each. 
This process repeats itself n times until hitting arrays of length one, for which the Fourier 
Transforms are trivially found — in Eq. (1), when N = 1, we have Y = A. This is the 
idea behind Algorithm 1. Here, we denote by X^ s ' the state of a vector A in a step s of 
the computation. 

Algorithm 1 Classical FFT 

Require: This algorithm receives as input a vector A e C 2 . 
Ensure: The output is a vector Y € C 2 which is the DFT of vector A. 
l: for all a such that < a < 2 n — 1 do {Initialization, step n} 



^ Ct X (a n -ia n - 2 ...ao)2 *~ ^{o»-m»-2...«o)2 ■ 

for s from n — 1 to 0, downward do {Step s} 
for all < bn-i, ■ ■ ■ ,b s ,a s -i, . . . ,a < 1 do 



A, ( f } L , ^A (s+1) 



(6 n _i...6 s a s _i...o )2 /o (6 n -i-"6»+iOo,_i...oo)2 

+ _L w (6.6.+i...6»-iO...OW(.+l) (7) 

6: for all b such that < b < 2 n — 1 do {Re-ordering} 



7: set Y( h „ ,t ,...j,„i <— A 



("] 



(i>n-l6n-2— 60)2 ^~ ^(6 fcl---b re -l)2' 



The complexity of the FFT can be easily computed from the above considerations. If 
we denote by T%n the approximate number of steps of a DFT on an array of length 2 n , 
we may write T 2 » = 2T 2 »-i + 2", for n > 1, with T x = 0. Then, T 2 « = n2™, which means 
that the FFT algorithms have complexity 0(n2 n ) or, cquivalently, O(NlogN). 

3. Obtaining QFT from FFT 

If we treat A^ s ) as column vectors, it becomes clear that Algorithm 1 may be expressed 
as a matrix operation, such that 

x{ / ] - E P $4 S+1) , (8) 

0<fc<iV-l 

where P^ is a A x A matrix for < s < n — 1. The indices of vectors, as well as the 
rows and columns of the matrices will be numbered from to A — 1 in this paper. 



By observing Algorithm 1, we see in Eq. (7) that each row of X^ s ' depends only on 
two rows of X( s+1 > . Hence, each row of the matrices P( s > have only two nonzero entries. 
They are located on columns k = j — j s 2 s and k = j + (l—j s )2 s , where j s is the (s + l)-th 
bit of j (counting from the least to the most significant bit). Thus, each nonzero entry of 
matrices p( s ' will be placed only in the main diagonal, or in a subdiagonal, depending 
of the value of the bit j s of j. 

When k = j, we have the entries of the main diagonal of the matrices. If j s = 0, then 
the first term on the right hand side of Eq. (7) shows us that these entries are A=. If 

j s = 1, then the second term on the right hand side of Eq. (7) shows us that the entries 
are -4= w O'.Ja+i-Jn-i0...o) 2 _ 

V2 

When k = j — 2 s , we have the entries of the lower subdiagonal of the matrices. If 
j s = 0, then the entries are 0, according to Eq. (7). If j s — 1, then the first term on the 
right hand side of Eq. (7) shows us that the entries are -4=. 

When k = j + 2 s , we have the entries of the upper subdiagonal of the matrices. If 
j s = 0, then the second term on the right hand side of Eq. (7) shows us that the entries 
are -j=ujUeJs+i---j n -iO ...o) 2 _ \{ j s — l ; then the entries are 0, according to Eq. (7). Before 

we express these ideas mathematically let us introduce additional notation. 
If we define a binary fraction as 



0-j = 0-jojl---jn-l 



Jt 



2-^i 2 t+1 ' 

0<t<n-l 



(9) 



we may rewrite y(j«J«+i"-Jn.-i -" )2 as w (°-i) 2 
We note also that 



Js = 



l-(-l) 



(10) 



The generic matrices P^ s ' can now be written as, 

( ,J e (0.j)2"+" 



p(«) _ 



1 



1 



(-i)W 



V2 



(_1)I>J 



,(0.i)2" 



lo. 



if k = j 

if k = j - 2 s 

if k = j + 2 s 
otherwise, 



(11) 



which represent the operations performed by the steps of Algorithm 1. 
Proposition 2 The matrices P^ s ' are unitary. 



PROOF. We may solve 



fp(s)p(sy\ 



jk 



V p {s) p 



(s) p(s)* 
kl ■ 



(12) 



Q<KN-l 



When k — j, we have 



fp(s)p(s) f \ 



When k 

p(s)p(a)' [ 



n 



pis) 

33 


pM 

33 


* , p(«) 


p« 


, p(s) p 






1 

2 + 


if 


i - (-1) L 


") 


2 

1 

+ 2 


f 1+ i- 


-i)W 


-J 


2 





1. 



(13) 



j — 2 s or fc = j + 2 s we may perform analogous calculation and obtain 



jk 



0. Therefore 



\ I jk 



n 



Since the matrices P^ s ' are unitary, they could in principle be implemented on a 
quantum computer. It is important, when developing a quantum algorithm, to express 
the unitary operators in terms of universal quantum gates, that is, controllcd-NOTs and 
gates acting on single qubits. 

In order to find the universal gates for the quantum version of the FFT algorithm, 
several decompositions may be applied to matrices P( s ' . One method that may be in- 
sightful in this process is the QR decomposition [12,11], which factors a generic matrix 
into the product of a unitary matrix M — orthogonal, if the matrix to be decomposed 
is real — and an upper triangular matrix N. In the case of matrices P^ s ' we may apply 
a slightly modified version of QR decomposition which yields orthogonal and diagonal 
matrices as factors. This can be done because matrices P( s ' have the following property: 
their columns are either real or multiple of real columns. There are at least three well 
known methods for computing the QR decomposition: Householder reflections, Givens 
rotations and Gram-Schmidt decomposition. The last one is particularly interesting in 
this case because it takes advantage of the property mentioned above. 

Observe that any column of matrices P^ s ' may be obtained by fixing a value of k in 
Eq. (11) and running j from to N — 1. We note that the columns of matrices P("> are 
already orthonormal. Multiplying each column k by 



» 



(-1)I>J< 



2"-fc s (0.fc)2"+ s 



(14) 



for k = j, k = j — 2 s and k = j + 2 s , and then multiplying the corresponding cases in 
Eq. (11) by ci£ , we obtain the orthogonal matrices 

f(-l)l>J, iffc=j 

l-(-l)l>J 

^ '- , if k = ] - 2 s 

(15) 

. iffc=j + 2 s 

otherwise. 



M$ = 



1 



1 



(-i)L*J 



o, 



The upper triangular matrices are 

'(-l)l>Jc 



< = 



,3s(0.i)2-" 



o. 



if k = j 
otherwise. 



(16) 



Now we confirm the decomposition. 



Proposition 3 For any step s and for any number of bits n we have 

pW=MW7VW, (17) 

where the matrices M^ s ' and N^- 8 ' are given by Eqs. (15) and (16), respectively. 



PROOF. Since matrices N^ are diagonal we have 
When k = j, 

33 33 <<2 

= p(s) 

33 ' 



(18) 



(19) 



When k = j — 2 s or k = j + 2 s we may perform analogous calculations, and easily 
obtain (AfWjVW) . fe = P$ . D 

Let us now analyze the structure of the matrices M^'. Starting with M^°\ we note 
that 

f(-l)- 7 ', if.7 = fc 

i-izll j , iffc = i _! 

(20) 

if k = j + 1 

otherwise. 



M 



(0) 



jfc 



1 
V2 



i + i-iy 



v0, 



It is easy to check that 



m: 



(0) _ r®(n-l) 

2"x2" ~~ J 



#, 



1 1 



(21) 



where / is the 2x2 identity matrix, and H = A= I | is the Hadamard matrix. 

V2 ^ _ 1 

Now, we prove that M 2 V x27 , 
generic matrix A: 



M, 



( s -i) 



.,. i /2 „-i -f- We use the following formula for a 
'A ■ u , iffc=j (mod 2) 



(A ® J) jfe 



(22) 



0, 



otherwise, 



which can be obtained by analyzing the entries of A® I. Replacing A by M^_J 2n _ t in 
Eq. (22) and simplifying the result, we get the right-hand side of Eq. (15). Thus 



M. 



(«) 



m^::^®/. 



Proposition 4 TTie matrices M^ s ' may be decomposed in tensorial products involving 
only 2x2 identities and a Hadamard matrix, such that 

M (s) = j®(n-«-l) jy j® s> ^4) 



PROOF. Using (23) recursively s times and replacing n by n — s in Eq. (21) we get 
Eq. (24). □ 

Let us now analyze the structure of the matrices N^ s ' . They are diagonal and the 
entries are one or ±w c , < c < N. The first attempt in decomposing them is to write 
N( s > for a fixed s as a product of diagonal matrices, the entries of which are one or ±lj c 
for a fixed c. We rewrite Eq. (16) using Eqs. (9) and (10) to obtain the form 

{71 — 1 
to ( 25 ) 

0, otherwise. 

Since n and s are fixed, the factors we are looking for are obtained by fixing t. So, for 
given n and s, we define the matrix 

*<•.*■«> = P'* 9 """' ' lik = j (26) 

I 0, otherwise, 

where j s and jt are given by Eq. (10). Since matrices i?( s '*>") are diagonal for arbitrary 
s,t and u, they commute. We can now state the following proposition. 
Proposition 5 The matrices N^' may be written as 

n-l 

iV (s) = J| R( s ^ u \ (27) 

t=s+l 

with u = t — S + 1, when s < n — 1. When s = n — 1, A^™ -1 ' = /. 
PROOF. Using (-IP 3 = w- 7 '^ 2 " -1 , we see that 

S 

(-i) j3 ijy-* 2 "^"'" 1 = i. (28) 

t=0 

Then, using this result in Eq. (25) we get Eq. (27) when s < n — 1. Replacing s by n — 1 
in Eq. (25) we get A^™- 1 ) = J. □ 

The structure of the matrices i?( s '*> u ) is the following: they are diagonal; the entries 
are one when either j s or j t is equal to zero; the entries arc to 1 = cxp f 2 ^?) when 
j s and jt are simultaneously equal to one. Therefore, the matrix i?( s '*^ u ) is a controlled 
operation 

« M = (1m\ , (29) 

with control on qubit s and target on qubit t (or vice- versa, in this case). This is a 
generalization of control gates acting on two qubits in the presence of more qubits. 
Summarizing, the QFT can be expressed as 

n-l 

F 2 n = AW Yl MWJV( S ), (30) 



where the matrices M^ and N^ are given by Eqs. (24) and (27) respectively. A^ is a 
2" x 2" matrix implementing the final swaps of the algorithm. 

4. Building the QFT from the Proposed Decomposition 

Based on the last section we may finally derive a quantum algorithm to compute 
the DFT. fn the beginning of classical FFT, we have a collection of complex numbers 

(n) 

X) >. These values now correspond to the quantum state 

(o n _io n _2---a ) * M 

|V>n} = J2 X (2L 1 a n - 2 -a„)\ a n-l a n-2---a a ). (31) 

0< a n _ i , a n — 2 , . . . , ao < 1 

This preparation of the quantum system corresponds to the initialization of the algorithm. 
Once the state has been prepared, we should apply the matrices P^ given by Eq. (11) 
in the following order: 

\^o)=P ( " ) P (1) ---P {n - 1) Hn)- (32) 

The state |V>o) is 

0<c„_i---c o <l 

with the coefficients labelled in the inverse order. 

Algorithm 2 QFT 

Require: This algorithm must receive as input a vector X <G C 2 . 

Ensure: The output is a quantum state \ip) whose amplitudes correspond to the elements 
of Y e C 2 " , given by the DFT of X. 

l: {Initialization, step n} 

2: prepare the state of the n-qubit quantum register as 

N-l 

|Vn>= E^l fc )' 
fe=0 



for s from n — 1 to 0, downward do {Step s} 
for t from n — 1 to s + 1 , downward do 

apply unitary operation Jj( s '^* _;s + 1 ) 
apply a Hadamard gate only on qubit s. 

for t from to [n/2\ — 1 do {Re-ordering} 
swap qubits t and n — t — 1. 



Although the matrices P^' are, in general, too complex to be directly realized in a 
physical experiment, it was shown that each of them can be decomposed into simpler 
matrices M^ s > and N^ s \ which in turn may be decomposed into gates acting on one or 
two qubits. 

In each step s, we must first apply the matrix W( s ). In step n-lwe have A^™ -1 ) = /. 
Eq. (27) shows us that each matrix A^ s ^ is a product of other simpler matrices. Hence, 
we must apply the logical gate R( s ' t > t - S + 1 ) — gate i?(* -s+1 ) with control on qubit s and 
target on qubit t — for each t starting from t = n — 1 and going downward until t = s + 1. 



AT(3) jtf<3) TV 


2) 


M( 2 ) Art 1 ) 




M(!> 


AT(O) 


M(°) 




° 1 t 1 1 tt 1 o ^ 
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H y ' 
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I 1 


I 1 H |-* — 



Fig. 1. A circuit for QFT over n 
starting from up to n — 1. 



4 qubits. Note that the qubits arc numbered from bottom to top, 



Then, we apply the matrix M^' , which corresponds to a Hadamard gate acting only on 
qubit s. After running s from n — 1 to 0, we must apply swaps to correct the order of 
the output. We compiled these steps in Algorithm 2. 

In Fig. 1 we represent the QFT circuit over four qubits in terms of Hadamard, con- 
trolled gates i?(") and swap operations. Note that matrices M^ and N ^ are also shown 
on the top. An alternative presentation of the circuit may be obtained by interchanging 
some of the gates that do not involve operations on the same qubits. 

Now we address the computational complexity of Algorithm 2. We assume that the 
initialization is done in negligible time — which is a reasonable assumption, since in most 
applications known so far the QFT is applied on a state of the computational basis, or 
on the output of some earlier step of some algorithm. In the main part of the algorithm 
(the outer loop) we have n(n + l)/2 steps. Therefore, this part of the algorithm is 0(n 2 ). 
The last part of the algorithm consists only on 0(n) swap operations. We conclude that 
the QFT has complexity 0(n 2 ) or, equivalently, 0(log TV), in terms of one and two 
qubit operations. It is quite simple to show that this complexity does not change when 
calculated in terms of universal gates. One just needs to recall that a controlled gate can 
be decomposed into two CNOTs and three one-qubit gates, and that a swap gate can be 
decomposed into three CNOTs. 

As an intermediate step before deducing the Approximate QFT, we may consider a 
classical algorithm for the Approximate FFT. This is quite similar to Algorithm 1, the 
only difference being that instead of u) ( b s---bn-i0...o) 2 m ^g seconcl term on the right 
hand side of Eq. (7), we have aj( bs - b min( S +m-i, re -i)0...o)2_ gy repeating all the process of 
finding the generic matrices P^ and decomposing them, we find out that this difference 
reflects only on Eq. (27) — in the approximate algorithm the productory ranges from 
t — s + 1 to min(s + m — l,n — 1). Analogously to the exact algorithm, we may check 
that the Approximate QFT has complexity 0(mn) or, equivalently, 0(m log N). In fact, 
the approximate version of QFT is not only simpler and faster, but also leads to more 
precise results in the presence of decoherence than its exact counterpart [9] . 

5. Conclusions 

The QFT algorithm represents an important improvement in the complexity of the 
classical algorithm, from O(NlogN) to 0(log N). The acceleration provided by Ap- 
proximate QFT algorithm goes even further, as 0(m log A), where m is a parameter 



10 



that can be taken as O(loglog-ZV). A practical difference between the classical and the 
quantum FFT is that the latter provides the result of the calculation as a superposition 
of quantum states, which cannot be directly read according to the postulates of quan- 
tum mechanics. However, a remarkable example showing the advantage of the quantum 
version of FFT is the quantum algorithm for factorization of large integers [4] . In this al- 
gorithm, the quantum FFT subroutine plays an essential role on the exponential speed-up 
over the best classical algorithms for integer factorization. 

In this paper, the building process of the QFT was exposed in detail, generalizing 
Coppersmith's work. We started from the description of the classical FFT and then 
obtained generic unitary matrices for each step of the algorithm. In order to get matrices 
simple enough to represent feasible quantum operations, those generic matrices were 
factored according to QR decomposition and the formulation in terms of one- and two- 
qubit gates was obtained. The complexity of this particular algorithm does not change 
when expressed in terms of universal gates. 

We argued that the Approximate QFT is also reobtained by the method proposed here, 
with analogous calculations. Depending on the chosen parameter, the number of matrices 
generated by the decomposition of the approximate algorithm can be considerably lower 
than that of the exact one. This reflects on the complexity of the quantum algorithm. 

The deduction of the algorithms addressed in this paper is different from previous 
works, and we hope it may provide insight for the development of new efficient quantum 
algorithms. As future directions, we are interested in checking whether other classical 
algorithms can be analysed according to this technique, starting from its matrix form 
and then decomposing it until simpler unitary matrices are obtained. 
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